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ABSTRACT 

A fully Sinc-Galerkin method for the numerical recovery of spatially varying diffusion 
coefficients in linear parabolic partial differential equations is presented. Because the pa- 
rameter recovery problems are inherently ill-posed, an output error criterion in conjunction 
with Tikhonov regularization is used to formulate them as infinite-dimensional minimization 
problems. The forward problems are discretized with a sine basis in both the spatial and 
temporal domains thus yielding an approximate solution which displays an exponential con- 
vergence rate and is valid on the infinite time interval. The minimization problems are then 
solved via a quasi-Newton/trust region algorithm. The L-curve technique for determining 
an appropriate value of the regularization parameter is briefly discussed, and numerical ex- 
amples are given which demonstrate the applicability of the method both for problems with 
noise-free data as well as for those whose data contains white noise. 


^his research was supported by the National Aeronautics and Space Administration under NASA Con- 
tract No. NAS1-18605 while the author was in residence at the Institute for Computer Applications in 
Science and Engineering (ICASE), NASA Langley Research Center, Hampton, VA 23665. 
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1 Introduction 


In this paper, a fully Sinc-Galerkin method is introduced for the numerical recovery of 
variable diffusion coefficients in linear parabolic partial differential equations. To illustrate 
the method, consider the problem of estimating the spatially varying parameter p(x) in the 
diffusion equation 

£(p)u = 0<x<1 t>0 

u(0,t) = u(l,t) = 0, *>0 (1.1) 

u(x , 0) = 0, 0 < x < 1 

given measurements of the data at the points in (0, 1) x 1R + . As noted in 

[1], problems of this type arise in applications ranging from physiological modeling to sea 
sediment analysis. 

For many applications, it is physically reasonable to assume that p is continuous on [0, 1] 
and to let the admissible parameter set Q be defined by 

Q = {pe H\ 0, 1) : p(x) >p 0 > 0}. 

With this definition, the existence of a unique solution u to the forward problem can be 
obtained on any fixed time interval, (0,r],r > 0, for / sufficiently smooth. 

The objective of the parameter recovery problem is to choose p* € Q so that the solution 
u* of (1.1) corresponding to p* agrees with the true state u. In general however, the true 
state u is not known and measurements are taken instead from an observation space Z. In 
this paper, the data are taken to be point evaluations and the observation space Z is defined 
to be Z = JR n p‘ n «. The observation operator C : C((0, 1) x (0,r]) — ► Z is then given by 

c* = w*„ m 

The “idealized” recovery problem may then be formulated as follows: determine p 6 Q so 
that 

^ u {'i ') p) — d 
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where d is used to denote the data. Since the forward problem is well-posed, the parameter 
recovery problem may be formulated as 

K(p) = d (1.3) 

where the nonlinear operator JC is defined by 

K(p) = CC-'{j>)f. 

The problem (1.3) is impractical to solve for several reasons. As indicated in [12], the 
problem is ill-posed in the sense that solutions p (provided they exist) may not depend 
continuously on the data d. Hence, discretized versions of this problem are likely to be highly 
ill-conditioned. Consequently, some sort of regularization (i.e., stabilization) is required to 
obtain an accurate approximation for p. 

The regularization technique that is used is Tikhonov regularization [19] and the problem 
(1.3) is replaced by the minimization problem 

minT a (p) (1.4) 

P€Q 

where the Tikhonov functional is 

Ur)=\{\\ K(p)-<r|| ! + aJ(p)). 

Here a > 0 is a regularization parameter, which controls the tradeoff between goodness 
of fit to the data and stability. The penalty functional J(p) provides stability and allows 
the inclusion of a priori information about the true parameter p*. Since the parameter is 
assumed to be “smooth” in the sense that p € #*(0,1), the penalty functional is taken to 
be the norm 

J(p) = IIpIIq = J Q \p'{x)] 2 v(x)dx + e J q \p{x)) 2 v{x)dx. (1.5) 

The parameter e is of the order 10~ 6 and the weight v is taken to be the positive function 
u(x) = x(l — x). The reasons for weighting the integral as well as including the second 
term and enforcing J(p) to be strictly positive will be discussed in the fourth section of this 
paper. By using arguments similar to those in [8] and [15] and assuming that K.{p ) is one to 
one, it can be shown that with this definition for J(p ), the solutions p a to (1.4) converge as 
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the regularization parameter a — » 0 and as the perturbations in the data and operator tend 
to zero. 

Due to the infinite dimensionality of Q and that of the state space, the problem (1.4) 
is an infinite-dimensional minimization problem. In order to develop a practical numerical 
scheme, the problem must be replaced by a sequence of finite-dimensional problems; that 
is, one must approximate the operator K, and minimize the functional T a over a finite- 
dimensional admissible subspace of Q. 

The evaluation of IC(p) requires the solution of the partial differential equation (1.1). 
Similar PDE’s must be solved to obtain the components of the derivative tC'(p). The con- 
struction of an approximate solution to these forward problems commonly begins with a 
Galerkin discretization of the spatial variable with time-dependent coefficients. This yields a 
system of ordinary differential equations which is solved via differencing techniques. Due to 
stability constraints on the discrete evolution operator, low-order methods with small time 
steps are often required to obtain accurate approximations. Moreover, this time-stepping 
must be repeated at each step in the minimization of (1.4). A final difficulty lies in the need 
to interpolate at data points which do not coincide with the nodes of the ODE solver. 

In contrast, the method of this work implements a Galerkin scheme in time as well as 
space thus bypassing many of the difficulties associated with time-stepping methods in the 
context of inverse problems. This possibility was first explored in [12]. In contrast to the 
methods of that work however, both the spatial and temporal basis functions are taken to 
be compositions of sine functions with suitable conformal maps. 

By discretizing the forward problem in this manner, the optimal exponential convergence 
rate is exhibited throughout the infinite time domain, even in the presence of boundary 
singularities. The validity and exponential convergence rate of the approximate solution 
throughout all time is especially important in those problems in which the data is sampled 
at large temporal values t q . Furthermore, the sine quadrature rules yield coefficient matrices 
which are efficiently constructed for the forward problem and easily updated when the for- 
ward techniques are employed in a parameter recovery scheme. The efficiency of the inverse 
scheme is further augmented by the fact that the component matrices used in formulating 
the finite-dimensional penalty functional are identical to those used when constructing the 
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forward coefficient matrices and hence need to be formed only once. The efficiency and 
accuracy of the forward solver and the ease of formulating the penalty functional are then 
manifested in the inverse algorithm for a large class of problems. 

The foundations of the Sinc-Galerkin method and the fundamental quadrature rules are 
described in Section 2. A thorough review of sine function properties can be found in [17] and 
[18]. In the third section of this paper, the Sinc-Galerkin system for the forward problem is 
constructed and implementation details are discussed. The section closes with the discussion 
of a very robust and accurate algorithm for solving the resulting algebraic system. Section 4 
includes the finite-dimensional minimization problem with the discussion centering around 
the construction of the various components of the Tikhonov functional. By taking advantage 
of sine function properties, efficient routines for approximating the nonlinear operator K(p) 
and the penalty functional ) are developed. In the next section, a quasi-Newton/trust 
region scheme is outlined for solving the finite-dimensional minimization problem. The paper 
concludes with a section containing numerical examples. Of the many examples tested, those 
discussed in this section best exhibit the features necessary for the practical implementation 
of the Sinc-Galerkin method. A brief discussion of the Generalized Cross Validation (GCV) 
and Z-curve techniques for choosing the regularization parameter a is given at the beginning 
of the section, and the applicability of these techniques in conjunction with the Sinc-Galerkin 
method is demonstrated by the numerical results. Finally, results are included both from 
data sets with white noise and from sets to which no noise was added. As shown in these 
examples, the Sinc-Galerkin method works equally well in both cases. 
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2 Sine Function Properties 


For the Sinc-Galerkin method, the basis functions are derived from the Whittaker cardinal 
(sine) function 

. , x sin(7rx) 

sinc(x) = — - — -, —oo < x < oo 

nx 

and its translates 

S(k, h)(x ) = sine J~~ \ > h > 0. 

For h* = three adjacent members of this sine family (S(k,h*)(x),k = —1,0, 1) are shown 
in Figure 1. 



-6 -4 -2 0 2 4 6 

x-axis 

Figure 1. Three Adjacent Members (£■(&, h*)(x), k = — 1,0, l,h* = of the Translated Sine 
Family. 

To construct basis functions on the intervals (0,1) and (0, oo), respectively, consider the 
conformal maps 

<K Z ) = ln (y—;) (2-1) 
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and 

T(u>) = ln(w). 

(2.2) 

The map 4> carries the eye-shaped region 



D e = jz - x + ty : | arg Q _ z )| < d < -} 

(2.3) 

onto the infinite strip 

Ds = U = C + «? : M < d ^ \)- 

(2.4) 

Similarly, the map T 

carries the infinite wedge 



Dw = {w-t + is: |ar< 7 (u;)| < d < 

(2.5) 


onto the strip Ds . These regions are depicted in Figure 2. 







Figure 2. The Domains Ds,De, and Dw- 
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The sine gridpoints z * € (0, 1) in De will be denoted x* since they are real. Similarly, the 
gridpoints Wk € (0, oo) in Dw will be denoted tk . Both are inverse images of the equispaced 
grid in Ds ; that is, 

~kh 

(2.6) 


Xk = <f> l (kh) = 


and 


1 + e kh 
tk = 'T~ 1 (kh) = e kh . 


(2.7) 


To simplify notation throughout the remainder of this section, the pairs </>, De and T , Dw 
are referred to generically as x> D. It is understood that the subsequent definition, theorems, 
and identities hold in either setting. Furthermore, the inverse of x is denoted by t/>. 

The important class of functions for sine interpolation and quadrature is denoted B(D) 
and defined next. 

Definition 2.1. Let B(D) be the class of functions F which are analytic in D, satisfy 


f \F(z)dz\ — ► 0, t —* dtoo 
U{t+L) 

where L = {is : |s| < d < and on the boundary of D (denoted dD) satisfy 

n(f) = J sd w*)<fei < <*>• 

The following theorem for functions in B(D) is found in [16]. 


Theorem 2.1. Let T be (0,1) or (0,oo) when x = <f> or T, respectively. If F 6 B(D) and 
Zj = i}>(jh ) = X~\jh), j = 0, ±1, ±2, • • • , then for h > 0 sufficiently small 


J r F{z)dz 


n*i) 


h S'oo X'(*i) 


< I< x e 


—2 ird/h 


( 2 . 8 ) 


Theorem 2.1 illustrates the exponential convergence rate which is a trademark of the sine 
methods. There is a common occasion when it is possible to evaluate the infinite series 
appearing in (2.8), namely when integrating against S(k,h) o x- In general, however, the 
series must be truncated. With additional hypotheses, it is proven in [11] and [17] that the 
truncation need not be at the expense of the exponential convergence. 
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Theorem 2.2. Assume F € B(D) and that there exist positive constants K,a, and /? such 
that 

I nr) I 


Ix'M 

Then for h sufficiently small 


<I<< 


e t G ^((-oo,0)) 

e -0lx(T)|^ r € V*([0> °°))- 


(2.9) 




< K ie - M ' h + — e ~ aMh + 

a p 


( 2 . 10 ) 


Theorems 2.1 and 2.2 are used to establish a uniform error bound when constructing an 
approximate solution to the forward second-order time-dependent problems. The application 
of these quadrature theorems is facilitated by the identities 


and 


C ■ [S(P, h) o *(»)] 


= < 


Z=Zi 


- * 


J fyS(p,h)°x{ z ) 


= < 


ZZ=Zi 


1. * = p 
0, i f p, 

0, i = p 

(-i y- p 


{ (i-p) 


■» * # P 


= >y 


_£ 

d x 


2 S{p,h)o X (z) 


7T 

T’ 


% — p 


i+v 


( 2 . 11 ) 


( 2 . 12 ) 


(2.13) 


(< - vY 

which denote the evaluation at the gridpoint z\ of the sinc-map compositions and their 
derivatives with respect to the map x- 
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3 The Forward Problem 


Consider the second-order parabolic problem 

Cu(x,t ) = ^ = 0 < ^ a; ' < ^l) t > 0 

u(0,t) = u(M) = 0, t> 0 (3.1) 

u(x,0) = 0, 0 < x < 1 . 

To define the Sinc-Galerkin approximation to (3.1), let 5,(x) = S(i,h x ) o <^>(x) and 
Sj(t) = S(j,ht) o T(<), and take the basis to be where 

^(x,<) = 5,(x)5;(0. 

The approximate solution is then defined by way of the tensor product expansion 


n* n, m x = M x + N x -f- 1 

u m x m t { x yi) = 0 > ( 3 - 2 ) 

i= ~ M * i= ~ Mt m t = M t + N t + 1 . 

The m x • mi unknown coefficients {u,j} are determined by orthogonalizing the residual with 
respect to the set of sine functions {S p S*}IZZm x ,"-'!n t - This yields the discrete Galerkin 
system 

(£u mi mi — fiSpSg) = 0 (3-3) 

for p = —M x , ■ • • , N x and q = —M t , • ■ • , N t . The inner product (■, •) is taken to be 

(F, G) = f f F(x,t)G(x,t)w(x,t)dxdt (3.4) 

Jo Jo 

with the weight 

w(x,t) = u>(x)u;*(f) = (^ , (x)) _ »(T (t))$ . (3-5) 

A thorough discussion motivating this choice of weight can be found in [10] and [13], 

Because of the tensor nature of the approximate solution, the domain on which (3.1) is 
posed, and the form of the inner product, the discrete system (3.3) can be formulated by 
combining the discrete systems corresponding to the one-dimensional problems 


u(<) = r(t ) , 0 < t < oo 

i/(0) = 0 


(3.6) 
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and 


(3.7) 


(p(x)u'(x))' = r(x) , 0 < x < 1 
u(0) = u(l) = 0 . 

This latter approach also illustrates the sine parameter selections which are needed when 
implementing the method. 

Continuing with (3.6), a discrete system is formed by orthogonalizing the residual 
with respect to Before invoking the quadrature rules, integra- 

tion by parts is used to transfer the differentiation of u onto S*V T, where again, w* = VT 
denotes the temporal inner product weight. To guarantee that the boundary terms vanish, 
it is assumed that 

<— *0+ y/t y/t 

Finally, the resulting integrals are evaluated via (2.10) or (2.8) when possible. With respect 
to (2.9), the condition 

r, t e (0,1) 




< L 


t~ s , t € [l,oo) 


guarantees the boundedness necessary to truncate the infinite quadrature rule. With 7 and 
8 specified and M t chosen, the parameter selections 


h t = 


I 7T d 

7 Mt 


and 

N t = + l] (3.8) 

where [*] denotes the greatest integer function, balance the asymptotic quadrature errors in 
(2.10) to at least 

In many parabolic systems, it is reasonable to assume that the solution decays exponen- 
tially at infinity, that is that the solution satisfies 

| u (*)\/^) I ^ L 


r, ie(0,l) 

e ~ 6t , fc[l,oo) 
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or, more succinctly, 


\u(t)\ <KV+h~ 6 K (3.9) 

Under this supposition, Lund [11] shows that the condition (3.8) can be replaced by 

Nt = + ij . (3.10) 

The selection Nt in (3.10) significantly reduces the size of the discrete system with no loss 
of accuracy. It is also noted that the size of the discrete system and the expected error are 
dictated by the asymptotic behavior of u at the endpoints. 

The discrete system for (3.6) can then be formulated as follows. Let 7^, l = 0, 1 denote 
the m t x m t matrices whose g;'-th entry is 6$ from (2.11) and (2.12) and let V(j]) be the di- 
agonal matrix with entries • • • , The vector of unknowns u = • ■ • , u/v,] T 

is then related to the known vector r = [r(t_^,), • ■ • ,r(<w,)] T by 

A t u = 2?((T)“a)r (3.11) 


where 


A t = 


—/<*> + i/] C((T)i). 


(3.12) 


Further details concerning the derivation of the system (3.11) can be found in [10] and a 
thorough analysis of the spectrum of A t is given in [3]. 

The preceding discussion applied to the problem (3.7) follows a similar development. The 
map T of (2.2) is replaced by the map <f> of (2.1) (since (3.7) is posed in the interval (0,1)) 
and ht is replaced by h x . Orthogonalizing the residual and two integrations by parts yields 
the system 

/T ' 


/ «(*) 

Jo 


p{x) S p (x) 


1 




dx 


= / r(x)S p (x) ■ 7 = dx (3.13) 

Jo ’few 


for p = —M x , • ■ • , N x . To guarantee that the boundary terms vanish, it is assumed that 

ii 


r'^)Ho=W^)) w 


= o. 


In anticipation of the parameter recovery problem which motivates this analysis, the term 
p(x) in (3.13) is expanded as a linear combination of sine functions with two Hermite-like 
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algebraic terms added to accommodate the nonzero values of p at x = 0 and x = 1. The 
finite-dimensional approximation of p then takes the form 




N x -l 

C_M,(1 - Z) + C N ,X + Y C kSk{x) 

k=-M*+l 

Y c k (k(x) . 

k——Mi 


(3.14) 


In the forward problem, the coefficients {c k }k=-M, are known whereas in the corresponding 
parameter recovery problem, they are unknown and are determined via methods to be dis- 
cussed in Section 4. The number of basis functions used in the expansion is chosen so as to 
guarantee a square coefficient matrix. This is done to simplify the implementation of the 
method when applied to the PDE (3.1) of interest. 

The expansion (3.14) is substituted into (3.13) and the resulting integrals are evaluated 
via (2.10) or (2.8) when possible. As shown in [13], the decay condition (2.9) equates to the 
condition 


|u(a:)P(x)| < L 


x a+ h, xG(0,|) 

(l-x)^H, *e[§,i) 


where 

P(x) = p(x) - p(0)(l - x) - p(l)x. 


This may be replaced by the more stringent condition 


\u(x)P{x)\< tfx" + *(l-x) /H *. 


(3.15) 


The asymptotic errors are then balanced by the choices 


hx — 



and 


a 


N s = yM x + l 


where [] again denotes the greatest integer function. Note that if |M X is an integer, this 
integer can be selected for N x • 
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With u,r, and I^\ i = 0, 1,2, defined as before, the system for (3.6) may be written as 


A(p)u = V((<f>')~$)r 


(3.16) 


where 


m 


■J-/W _ V (^)] vm-'Wr) ■ 


(3.17) 


The notation T>{p^) and 2}(py,<) denotes the diagonal matrices containing the components of 
the vectors jty and p^> which are defined as follows. First 


P4, = tfc 

where c = [c_Af,, • • • , c^,] T and ^ has the block structure 

* = : / <0) : lfoUx™. 

with 

0L = [(1 - X-M x ), ’••,(!- ^W,)] T 

and 

V’fl = [ x -M X '> • ' • > ®A^] T - 

Again, the m, x (m, — 2) matrix J(°) has components 6$ as defined in (2.11) with 
—M x < q < N x and —M x + 1 < j < N x — 1. Also, 

Pv = 'k'c 

where 

*' = l-r i - ; ikx„.. 

Here 1 = [1, ■ • • , 1] T , 'D(<f>') is m x x m„ and /W is m, x (m x — 2) with components <5^ as 
defined in (2.12). 

As shown in [13], the system (3.16) yields an approximate solution which exhibits expo- 
nential convergence to the solution u of (3.7). Further details concerning the derivation of 
the system as well as additional quadrature hypotheses can be found in this reference. 
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The above results for the one- dimensional problems (3.6) and (3.7) can then be pieced 
together to form the Sinc-Galerkin system for the time-dependent parabolic problem (3.1). 
The resulting discrete system is built from the matrices A t (in (3.12)) and A(p) (in (3.17)) 
of the one-dimensional problems. The parameter selections are still necessary and all that 
remains is to asymptotically balance the resulting errors from each one-dimensional problem. 

When the decay conditions (3.9) and (3.15) are combined to yield 

\P{x)u{x,t)\ < Kx a+ l{ 1 - xf+lp+h-**, (3.18) 


then the choices 



a 


Nx m U Mx + i ’ 


a 

~ It 


M t = + 


and 

for the stepsizes and summation limits balance the asymptotic errors. If one takes d = then 
the above choices yield an asymptotic error rate of order for the quadratures. 

Given M x ,N x1 M t ,N t , and h = h x = h t as defined above, the discrete system for (3.1) is 


A(p)UV ((f)-*) + V ((^)-t) UAj = G (3.19) 


where 

G = V (W 1 ) FV ((f )-») . 

The diagonal matrices V ((<^')“*) and ^ ((f)"’) have sizes x m * and m t x m u respec- 
tively. The m x x m t matrices U and F contain the unknowns {tqj} and the known values 

The discrete Sinc-Galerkin system (3.19) can then be solved for U via a generalized Schur 
decomposition (page 396 of [6]). As guaranteed by the results of Moler and Stewart [14], 
there exist unitary matrices Q\,Z\,Q 2 y and Z 2 such that 
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Q\A( V )Z , = P 
Q‘iV ((*')-*) Zi = R 
<j;o((f)-i)Zj = 5 
QlA,Zi = T 


where P,R,S, and T are upper triangular. If Y = Z{UZ 2 and C = Q\GQ 2 , then (3.19) 
transforms to 


PYT* + RYS * = C. 


By comparing the A:-th columns, one finds that 


P J2 tkjVj + RJ2 ^iVi = Cfc 

}=k j=k 


which yields 

(t kk P + s kk R)y k = c k - P tkjtfj ~ R J2 s kjVj (3-20) 

j=k+ l j=k + 1 

(for convenience, it is assumed that all matrices are n x n and indexed from 1 to n). With 
the assumption that the matrix (t kk P -f s kk R) is nonsingular, the solution to (3.20) is easily 
found by recursively solving triangular systems. 

Although this algorithm does require complex algebra, it is quite efficient and requires no 
assumptions concerning the diagonalizability of the component matrices. It should be noted 
that a “real” version of this algorithm also exists [5]. In this latter algorithm, Q\, Zi,Q 2 , and 
Z 2 are orthogonal with P, S quasi-upper triangular and R, T upper triangular. As proven 
in [5], the real algorithm is extremely stable and numerical tests have indicated that the 
complex algorithm described above is also robust. 
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4 The Finite-Dimensional Minimization Problem 


As noted in the introduction, the minimization problem 


minT.W 


where 

T«(p) = j{||K(p) - i H J + olbllg) (4-1) 

is infinite-dimensional and hence must be replaced by a sequence of finite-dimensional prob- 
lems before a viable numerical scheme may be developed. Following from (3.14), the ap- 
proximating admissible parameter sets are taken to be 

I Nl 

Qm x = S Pmi '• Pm, ( J ) = ^ > c kCk(x) 

l k=-M x 


where 


1 — x, k — —M x 


Ck{x) 


\ S k {x), —M x + l<k<N x -l 


x, k = N x 


(4.2) 


and Sk(x ) = S(k,h x ) o <f>(x). The associated finite-dimensional optimization problem can 
then be formulated as 


mm T a {p mx ) 

Pmi tWmjr 


(4.3) 


for 

T a {p mx ) = 2 {ll^(/Vr) - <^|| 2 + “llPmJlc} • ( 4 - 4 ) 

The approximation K(p mx ) : 2R m * -* JR n * n * to K(p) is obtained by applying the point eval- 
uation operator C in (1.2) to u mxTn , in (3.2). If the set of observation points {(#*>» ^)}p=i’.. ’np 
can be represented as a tensor product of spatial and temporal points, then K(p mx ) has the 
representation 

K(p mx ) = Cc4U) (4.5) 

where the matrix U solves (3.19). The matrix concatenation co(U ) is the vector in 
which is obtained by successively stacking the columns of the m x x m* matrix U. C is an 
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(n P 'T* g ) x (m x -m t ) evaluation matrix which can be formulated as follows. Define the n p x m x 
spatial evaluation matrix E x to have components 

[•fi'xjp.i = *-*»(‘*'p)j 1 P ^ Tlpf — M x ^ t 5: N m 

and define the n q x roj temporal evaluation matrix Et to have components 


[E t ) Vtj = s;(t q ), 1 < q < n„ -M t <j< N t . 


Then 

C -E t ® E x . 

It is noted that if the set of observation points is not rectangular as described above, then 
point evaluation can be done directly via (3.2). This latter option is less efficient however, 
than that defined in (4.5). 

The discrete penalty functional ||pm,||g is formed by substituting the expansion (3.14) 
into the definition (1.5). This yields 

l|Pm,||g= f \p' mx { x )] 2 v(T)d x + e f [p mt (x)] 2 v(x)dx « c T Qc 
«/0 JO 

where the m, x m x matrix Q = Q* -f Qj has components 

[Qdlkt ~ t C' k {x)(' t (x)v(x)dx, —M x < k,t < N x 
Jo 

and 

[Qj]kt « £ f (k(x)(t{x)v(x)dx, -M x < k,£ < N x . 

JO 

The matrix entries are approximations in the sense that sine quadrature is used to evaluate 
many of the integrals. 

For the choice of basis functions in (4.2), the matrix Qj is given by 


Q d = 


Qd 


<i Id 

A 

Qd 

- 1 
-q d 


~Qd 

i 

6 
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Integration by parts and the application of the sine quadrature formula (2.10) yields the 
(m x - 2) x ( m x - 2) matrix 

Qd = 

where again denotes the matrix whose qfj-th entry is from (2.13). The zeroing of all 

other quadrature terms is a result of the choice v(x) = = x(l — x ). The (m, — 2) x 1 

vector qd has components 

[q d ]k = h x (x k - 3 x\ 4- 2*J), -M x + l<k<N x -l 

and is again obtained via sine quadrature. Because 1^ is negative definite (see [16]), the 
matrix Qd is nonnegative definite. The zero eigenvalue results from the fact that the first 

and last columns of Qd differ only in sign. 

Direct integration and sine quadrature are also used to obtain the matrix 

l => T 1 

20 9// 30 

<Ijl Qf 9/r 

__L z. T J_ 

30 Vf r 20 

Here 

0/ = h x V(x 2 ( 1 - s) 2 ) 

where V{q) again denotes the diagonal matrix with entries q(x-M x ), ■ • • , q(xN x ). The vectors 
qji and qj r have components 

[qji]k = h x xl{\ - x k f 

and 

[qjr]k = h x xl(l - x k ) 2 

for Jfe as -M x + 1, • ■ • , N x - 1. The matrix Q/ is strictly positive definite. 

Although the matrix Q is full, it is very efficient to construct since the Toeplitz matrix 
1 1 2 ) is also needed in the forward solver. For e > 0, Q is symmetric and positive definite and 
hence has a Cholesky decomposition Q = R T R where R is upper triangular. It then follows 
that the penalty term ||p mi ||g yields the quadratic form 

c t R t Rc = || Tie || 2 (4.6) 
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where || • || denotes the Euclidean norm. As will be shown in the next section, this factor- 
ization admits a particularly useful diagonalization of the corresponding finite-dimensional 
minimization problem. It also facilitates the plotting of the L-curve to determine a suitable 
regularization parameter a (see Section 6 and [7]). 


5 The Trust Region Scheme 

In the discussion of this section, it is useful to highlight the dependence of the operators in 
(4.4) on the unknown vector c = [c„m x , • • • , £W,] T (see (3.14)). Letting 

K{c) = K{p mx {c)) = C c^{U) 

and noting (4.6), the optimization problem (4.3) may be replaced by 

min T„(c) (5-1) 

se. R m * 

where 

T„(c) = i{ii/f(c)-<rr +Q ||ij?in. 

To obtain a minimizer for the nonlinear functional T a , a quasi-Newton/trust region scheme 
is used (see [2]). 

The basis for this approach is the iteration 

Cfc+i = Ck + ifc 

where I* solves the constrained minimization problem 

,nun_ i{||/C(c- t ) + K'(c„)s t - I f + o|| fl(c t + 5,011’} (5.2) 
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subject to ||i?ifc|| < 6*. The trust region radius 6k is chosen so that the quadratic model 
adequately reflects the behavior of T a within the trust region; that is, 6 k is chosen so that 

r„(c» + S t ) « i{||Jfft) + A'ftft - d II 1 + a|| Aft + 4)11’} 

whenever \\Rsk\\ < 6k . The minimization problem (5.2) is solved using an approach similar 
to that in [9]. The problem is first diagonalized using the Singular Value Decomposition 
(SVD). With the change of variables 

J = Rs k , 

the objective functional in (5.2) becomes 

i{||/»-6||’ + a||J + jf} 

where A = K'(ck)R~* , T> = d — K{ck) and c = Rc k . Let A have the SVD 

A = UDV t 

where f/( np .n,)x( I ., «,),Kn I xm, are orthogonal and 

cr,-, if i = j and i < m x 
0, otherwise. 

Here <r, denotes a singular value of A. The second change of variables 

J = V T s t b = U T b, c = V T c 


yields the diagonalized problem 

ran i{||DS-4|| J + a||£+»|R (5.3) 

subject to || J || < 6 k . 

The theory of constrained optimization is used to solve (5.3). By the Kuhn-Tucker 
criterion [4], there exists a Lagrange multiplier fi > 0 such that 

D t (Ds — b) + ct(c + s) + fxs = 0. (5.4) 
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FVom (5.4) it follows that (5.3) has a unique solution of the form 


s = s(/i) = {D t D + (a + n)I) 1 (D T b — ac). 

If ||s(0)|| < 6k, then the constraint in (5.3) is not active and s = R -1 1^1(0) solves (5.2); 
otherwise the constraint is active and the solution to (5.2) is given by s = R _1 Vs(/i) where 
ft > 0 is the unique solution to 

M * PUOII - *k = 0. (5.5) 


An approximate solution to the scalar equation (5.5) is then determined via the hook step 
algorithm in [2] (see page 134). This algorithm requires both g(ft) and g'(n). As shown in 
[9], the function g([i) can be expanded as 



(5.6) 


when bj and Cj are components of b and c, respectively. The derivatives g'(fi) are easily 
obtained from the form (5.6). 

The trust region radius 6k in (5.3) is chosen so that T a {c ) has sufficient decrease at each 
iteration so as to guarantee convergence to a local minimizer of T a . This is accomplished via 
the updating algorithm in [2] (page 143) with the decay requirement taken to be 


T a (c k + Sk) < T a (c k ) + eVT a (3k) T s k 


with e = 10~ 4 . 

An important numerical issue in the implementation of the trust region scheme is the 
formulation of the derivation of the operator K. Here the derivative, or Jacobian, is an 
(n p • n,) x m, matrix whose j/-th column is given by 

[*•'(?)]„ = nrai[/C(c + TK) - K{i)] 

where the standard unit vector e„ has components 



1 


0 


if A: = v, —M x < k < N x 
otherwise. 
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In the examples that are presented in Section 6, the Jacobians were calculated with a 
standard forward difference scheme. This scheme is easy to implement and accurate enough 
for the purposes of the method. If further efficiency is desired, a directional derivative 
scheme such as that described in [12] can be used. For this method, the trade-off for the 
added efficiency is an algorithm which is more difficult to implement and a slight loss of 
accuracy in some cases. 


6 Implementation and Numerical Examples 

The four examples reported in this section were selected from a large collection of problems 
to which the Sinc-Galerkin method was applied. The results are representative of those 
obtained on other sample problems. 

The first example demonstrates the application of the Sinc-Galerkin method to a model 
problem in which the state solution was sampled directly; that is, no external noise was added 
to the data. To demonstrate the feasibility of the method for problems with noisy data, 
the same problem is revisited in Example 5.2 but with pseudo-random white noise added 
to the data. In Example 5.3, the parameter to be recovered has a logarithmic boundary 
singularity at x = 0 while the parameter in Example 5.4 is the shifted Gaussian function 
that was considered in [12]. In all four examples, the dynamics of the problem are assumed 
to be modeled by (1.1) with the forcing function f(x,t) consistent with the state solution 
u(x,t) = x(l — x)sin(47rx)f 2 e _1 and the true diffusion parameter p. In each case, d = | (see 
(2.3) and (2.5)). 
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The errors for the method are reported on the set of uniform gridpoints 


U = {zo,2i,"-,Zioo} 


Zk = k£, k = 0 , 1 , •••,100 


( 6 . 1 ) 


with stepsize l = jgg. With p and p m , denoting the true and approximate parameters 
respectively, the errors are reported as 


\\pum = 


max 

(KJK100 




The error and convergence results are tabulated in the form .aaa — 7 which represents 
.aaa x 10 -7 . All problems were run with sixteen place accuracy on a Vax 8550. 

A very important practical consideration is the choice of the regularization parameter a 
for a given (error contaminated) data set. One would like to choose a so that ||p — p a || is 
minimized, where p a denotes the a-dependent unknown diffusion coefficient. If the error in 
the data is random, then under certain conditions (see [20]) the method of Generalized Cross 
Validation (GCV) yields a statistical estimate of the size of ||/C(p) — £(p a )|| which is related 
to ||p — pd|. For Tikhonov regularization, this estimate is given by the GCV functional 


V(a) = 7 


1 


n 


\\K{c a )-df 


n — m 1 


TTl x 




a 


12 


( 6 . 2 ) 


» (3 (<*? + "“)J 

where c a solves (5.1). Here n = n q • n p denotes the number of data points and {<r,} are the 
singular values of the operator A’ / (c Q )i2 -1 . To approximate the value of a which minimizes 
IIP “Poll. one attempts to solve the minimization problem 


min VYa). 

Because the GCV method requires the singular values of K'(c a ), it is relatively expensive 
to implement when m x and n are large. A second disadvantage to this method for choosing 
the regularization parameter is that the GCV plots are often very flat making it difficult to 
determine a minimum value of V (a) and hence an optimal value of a (see Figure 6 in the 
next section). Finally, one often has optimization settings in which the GCV hypotheses are 
not satisfied. 
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A second method for determining the regularization parameter is to plot the norm of 
the penalty functional, ||i£c 0 ||, versus the norm of the residual, ||/'f(c 0( ) — d || (see [7]). In 
this way one can qualitatively get an idea of the compromise between the minimization of 
these two quantities. The scheme for determining the “optimal” regularization parameter 
consists of finding those values of a such that (||/f(c a ) — d ||, ||f7c a ||) lies in the corner 
of the resulting curve, known as the T-curve. This method for choosing the regularization 
parameter ct is easy to apply and often gives more conclusive results than the GCV method. 
Both methods are illustrated in the examples. 

In all four examples, the m x x 1 initial vector cq = [.5, .5, • • • , .5, .5] T was used. Several 
other positive startup vectors were also tried with similar results in each case. Hence the 
method seems to be quite robust with respect to the choice of the initial vector. 

Finally, in the examples the symbol a is used to denote both the regularization parameter 
(see (5.1)) and the sine decay parameter (see (3.18)). The use of this symbol for both 
quantities is well established in the literature and thus difficult to avoid in this setting. It 
should be obvious from the context, however, which quantity is being discussed and there 
should be no ambiguity resulting from the dual use of this symbol. 
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Example 6.1 In this example, the true diffusion parameter is taken to be the analytic 
function p(x ) = 1 + sin(7rx). Since the state solution is u(x,t) = x(l — x) sin(47rx)< 2 e -< , the 
decay condition (3.18) yields the choices a = (3 = 2, 7 = |, and 5=1. The data was sampled 
on a regular grid C (0, 1) x (0,2]. Nine equally spaced points x p = pAx, Ax = .1, 

were taken in space and four equally spaced temporal points t q = <?Af, At = .5, were 
taken for a total of n = 54 data points. No noise was added so the data consisted of direct 
measurements of the state solution. For varying values of the regularization parameter a, the 
L-curve is plotted in Figure 3. Note that the values a = 10 -7 through a = 10 -11 yield points 
(||#(&) — d ||, ||i?c a ||) in the “corner” of the curve. The uniform errors for a = 10 -8 are 
reported in Table 1 with the first four columns indicating the index limits for the expansion 
of the state variable and fifth column indicating the number of basis functions used in the 
expansion of p mx . The convergence of the method is demonstrated both by the results in 
the last column of Table 1 and by Figure 4 which shows the true and approximate diffusion 
parameters with a = 10 -8 . 


M x 

N x 

Mt 

N t 

m x 

IIPuWII 

8 

8 

10 

4 

17 

.7414 - 0 

16 

16 

21 

7 

33 

.7648 - 1 

24 

24 

39 

9 

49 

.3111 - 1 


Table 1. Errors on the Uniform Grid U with a = 10 8 in Example 6.1. 
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Example 0.2 Here the true parameter and state solution are the same as those in 
Example 6.1, and hence p(x ) = 1 + sin(7rx) and u(x,t) = x(l — x) sin(47rx)t 2 e _t . The same 
observation points were used but to this data however, we added a pseudo-random noise 
vector e from a Gaussian distribution with mean 0 and standard deviation a chosen so that 
the noise-to-signal ratio <7/||d|| = 0.001 (noise - 0.1% of the signal). The X-curve and GCV 
curves are plotted in Figures 5 and 6, respectively. Note that the values a = 10 -5 through 
a = 5 x 10 -8 yield points (||/<’(c a ) — d ||, ||i?c 0 ||) in the “corner” of the X-curve whereas 
all values of a less than 10 -5 yield apparent minima of the GCV curve. For M x = 16, the 
uniform errors obtained with a = 10 -3 , a = 10 -6 , and a = 10 -9 are reported in Table 2. 
Corresponding plots of the true and approximate parameters are shown in Figure 7. Note 
that the “corner” value a = 10~ 6 provides a good choice for the regularization parameter 
whereas a = 10 -9 is not large enough to damp out the contribution due to the smaller sin- 
gular values. This latter observation can be predicted from the X-curve but less easily from 
the GCV plot. Finally, the choice a = 10 -3 causes too much smoothing and information 
about the parameter is lost. By comparing the results in Tables 1 and 2, it can be seen that 
the error in this example with a = 10 -6 and M x = 16 is virtually the same as the error in 
Example 6.1 with a = 10 -8 and M x = 16. The results from this example demonstrate the 
viability of the method for problems with noisy data. 



a = lO" 3 a = 10- 6 a = 10" 9 

IM*)I 

.2658 - 0 .7737 - 1 .4357 - 0 


Table 2. Errors on the Uniform Grid U with M x = 16 in Example 6.2. 


27 






x-ixta 

Figure 7. True and Approximate Diffusion Parameters for Example 6.2 with M x = 16 
(a = 10" 3 ), (a = 10" 6 ), •••(<* = lO' 9 ), (True). 

Example 0.3 The true parameter in this example is p(x) = 1 + j x + x/n(x ) which has a log- 
arithmic singularity at x = 0. As before, the state solution is u(x,f) = x(l — x) sin(47rx)£ 2 e -t 
and thus the decay parameters a = /? = 2,7 = and 5=1 are consistent with the condition 
(3.18). To demonstrate the method for another set of observation points, nineteen equally 
spaced points x p = pAx, Ax = .05, were taken in space and four equally spaced temporal 
points t q = qAt , At = .5 were taken for a total of n = 72 data points. No noise was added so 
the data consisted of direct measurements of the state solution. Since the T-curve was nearly 
identical to that of Example 6.1, the regularization parameter was taken to be a = 10 -8 . 
The uniform errors for this choice are reported in Table 3 and the true and approximate pa- 
rameters are shown in Figure 8. Both the table and the figure demonstrate the convergence 
of the method in spite of the logarithmic singularity in the diffusion parameter. 
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M x 

N x 

M t 

N t 

m x 

IMOII 

8 

8 

10 

4 

17 

1.2171-0 

16 

16 

21 

7 

33 

0.1330 - 0 

24 

24 

39 

9 

49 

0.7285 - 1 


Table 3. Errors on the Uniform Grid U with a = 10 8 in Example 6.3. 



Figure 8. True and Approximate Diffusion Parameters for Example 6.3 with a — 10 8 


• - • (M x = 16), {M x = 24), (True). 
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Example 6.4 In this example, the parameter to be recovered is the shifted Gaussian func- 
tion p(ar) = 1 -f ^e -40 ( x- 3h When combined with the state solution, this dictates the 
choices a — /3 = 7 = |, and 6 = 1 for the sine decay parameters as specified by (3.18). 
Pseudo-random noise is added to the data in the manner described in Example 6.2. As 
seen in Figure 9, the Tikhonov parameter values a = 10 -B through a = 10 -8 yield points 
(||A'(c a ) — d ||, \\Rc a \\) in the “corner” of the T-curve. For M x = 16, the uniform errors 
obtained with a = 10 -3 , a = 10~ 8 , and a = 10~ 10 are reported in Table 4 with correspond- 
ing plots of the true and approximate parameters shown in Figure 10. As indicated by the 
numerical results, the “corner” value a = 10 -8 provides a good choice for the regularization 
parameter whereas a = 10 -3 causes too much smoothing. The error contributions due to 
the smaller singular values become quite apparent at a = 10 _1 ° thus reiterating the L-curve 
observation that this Tikhonov value does not provide enough regularization or smoothing 
for the problem. 



p 

11 

►— 1 
0 
1 

u 

P 

II 

1 

O 

t 

00 

P 

II 

H-* 

O 

1 

O 


.7710 - 1 .4109 - 1 .6805 - 1 


Table 4. Errors on the Uniform Grid U with M x = 16 in Example 6.4. 
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0.002 0.004 0.006 0.008 0.01 0.012 0.014 0.016 0.018 0.02 

Residual Norm ||tf(0 — «T|| 

Figure 9. The Tikhonov L-curve for Example 6.4. 



x-axis 


Figure 10. True and Approximate Diffusion Parameters for Example 6.4 with M . 
(a = 10 -3 ), (a = 10 -8 ), •••(« = lO -10 ), (True). 
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